51 research outputs found

Relaxations for inference in restricted Boltzmann machines

Author: Frostig Roy
Liang Percy
Manning Christopher D.
Wang Sida I.
Publication date: 02/01/2014

We propose a relaxation-based approximate inference algorithm that samples near-MAP configurations of a binary pairwise Markov random field. We experiment on MAP inference tasks in several restricted Boltzmann machines. We also use our underlying sampler to estimate the log-partition function of restricted Boltzmann machines and compare against other sampling-based methods. Comment: ICLR 2014 workshop track submission

arXiv.org e-Print Archive

CiteSeerX

Naturalizing a Programming Language via Interactive Learning

Author: Ginn Samuel
Liang Percy
Manning Christopher D.
Wang Sida I.
Publication date: 01/01/2017

Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. However, existing natural language interfaces for such tasks are quite primitive compared to the power one wields with a programming language. To bridge this gap, we start with a core programming language and allow users to "naturalize" the core language incrementally by defining alternative, more natural syntax and increasingly complex concepts in terms of compositions of simpler ones. In a voxel world, we show that a community of users can simultaneously teach a common system a diverse language and use it to build hundreds of complex voxel structures. Over the course of three days, these users went from using only the core language to using the naturalized language in 85.9% of the last 10K utterances. Comment: 10 pages, ACL201

arXiv.org e-Print Archive

Crossref

Simple Recurrent Units for Highly Parallelizable Recurrence

Author: Artzi Yoav
Dai Hui
Lei Tao
Wang Sida I.
Zhang Yu
Publication date: 01/01/2018

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability. SRU is designed to provide expressive recurrence, enable highly parallelized implementation, and come with careful initialization to facilitate training of deep models. We demonstrate the effectiveness of SRU on multiple NLP tasks. SRU achieves a 5-9x speed-up over cuDNN-optimized LSTM on classification and question answering datasets, and delivers stronger results than LSTM and convolutional models. We also obtain an average of 0.7 BLEU improvement over the Transformer model on translation by incorporating SRU into the architecture. Comment: EMNL

arXiv.org e-Print Archive

Crossref

Natural Language to Code Translation with Execution

Author: Fried Daniel
Ghazvininejad Marjan
Shi Freda
Wang Sida I.
Zettlemoyer Luke
Publication date: 01/11/2022

Generative models of code, pretrained on large corpora of programs, have shown great success in translating natural language to code (Chen et al., 2021; Austin et al., 2021; Li et al., 2022, inter alia). While these models do not explicitly incorporate program semantics (i.e., execution results) during training, they are able to generate correct solutions for many problems. However, choosing a single correct program from a generated set for each problem remains challenging. In this work, we introduce execution result-based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks. We select output programs from a generated candidate set by marginalizing over program implementations that share the same semantics. Because exact equivalence is intractable, we execute each program on a small number of test inputs to approximate semantic equivalence. Across datasets, execution or simulated execution significantly outperforms the methods that do not involve program semantics. We find that MBR-EXEC consistently improves over all execution-unaware selection methods, suggesting it as an effective approach for natural language to code translation. We open-source our code at github.com/facebookresearch/mbr-exec and data at dl.fbaipublicfiles.com/mbr-exec/mbr-exec-release.zip Comment: EMNLP 202

arXiv.org e-Print Archive
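
The selection step described in the MBR-EXEC abstract above can be illustrated with a small sketch. It is not the authors' implementation: candidates are stand-in Python callables rather than model-generated program text, the names `run_on_inputs` and `mbr_exec_select` are hypothetical, and the loss is assumed to be 0 when two candidates agree on every test input and 1 otherwise, so minimizing the total loss amounts to picking a candidate from the largest agreement group.

```python
from collections import Counter
from typing import Any, Callable, Sequence, Tuple


def run_on_inputs(program: Callable[[Any], Any], test_inputs: Sequence[Any]) -> Tuple:
    """Run one candidate on every test input and return its output signature.

    Exceptions are recorded as a marker so that failing candidates still
    receive a comparable signature. Outputs are assumed to be hashable.
    """
    outputs = []
    for value in test_inputs:
        try:
            outputs.append(program(value))
        except Exception as exc:  # any runtime failure counts as an output
            outputs.append(("error", type(exc).__name__))
    return tuple(outputs)


def mbr_exec_select(candidates: Sequence[Callable[[Any], Any]],
                    test_inputs: Sequence[Any]) -> int:
    """Return the index of the candidate whose outputs agree with the most others.

    Agreement on the test inputs approximates semantic equivalence, since
    exact equivalence cannot be checked in general.
    """
    signatures = [run_on_inputs(p, test_inputs) for p in candidates]
    group_sizes = Counter(signatures)
    # Ties resolve to the earliest candidate in the list.
    return max(range(len(candidates)), key=lambda i: group_sizes[signatures[i]])


# Hypothetical candidates for "double the input"; the third one is wrong.
candidates = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2, lambda x: 2 * x]
chosen = mbr_exec_select(candidates, test_inputs=[1, 2, 3])
```

With a single test input of 2, all four candidates return 4 and become indistinguishable, which shows why the choice of test inputs matters for the approximation.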

LEVER: Learning to Verify Language-to-Code Generation with Execution

Author: Iyer Srini
Lin Xi Victoria
Ni Ansong
Radev Dragomir
Stoyanov Ves
Wang Sida I.
Yih Wen-tau
Publication date: 16/02/2023

The advent of pre-trained code language models (CodeLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine CodeLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, it is challenging to obtain test cases for many real-world language-to-code applications, and heuristics cannot adequately capture the semantic features of the execution results, such as data type and value range, which often indicate the correctness of the program. In this work, we propose LEVER, a simple approach to improve language-to-code generation by learning to verify the generated programs with their execution results. Specifically, we train verifiers to determine whether a program sampled from the CodeLM is correct or not based on the natural language input, the program itself and its execution results. The sampled programs are reranked by combining the verification score with the CodeLM generation probability, and marginalizing over programs with the same execution results. On four datasets across the domains of table QA, math QA and basic Python programming, LEVER consistently improves over the base CodeLMs (4.6% to 10.9% with code-davinci-002) and achieves new state-of-the-art results on all of them. Comment: 23 page

arXiv.org e-Print Archive
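
The reranking step described in the LEVER abstract above might look roughly like the following sketch. It rests on assumptions that the abstract does not state: the verification score and the generation probability are combined by multiplication, execution results are compared as serialized strings, and the `Candidate` class and `rerank` function are hypothetical names. The probabilities in the example are made up for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    program: str          # sampled program text
    exec_result: str      # serialized execution result (assumed comparable as a string)
    lm_logprob: float     # log-probability assigned by the code language model
    verifier_prob: float  # verifier's estimated probability that the program is correct


def rerank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by the aggregated score of their execution-result group.

    Each candidate contributes generation probability times verifier
    probability (an assumed way of combining the two), and the contributions
    of candidates sharing an execution result are summed.
    """
    group_score: Dict[str, float] = defaultdict(float)
    for cand in candidates:
        group_score[cand.exec_result] += math.exp(cand.lm_logprob) * cand.verifier_prob
    # sorted() is stable, so candidates in the same group keep their sampled order.
    return sorted(candidates, key=lambda c: group_score[c.exec_result], reverse=True)


samples = [
    Candidate("a", exec_result="4", lm_logprob=math.log(0.5), verifier_prob=0.2),
    Candidate("b", exec_result="5", lm_logprob=math.log(0.2), verifier_prob=0.4),
    Candidate("c", exec_result="5", lm_logprob=math.log(0.1), verifier_prob=0.5),
]
ranked = rerank(samples)
```

In this example, candidate "a" has the highest individual score (0.10), but the two candidates that both produce "5" together score 0.13, so "b" is ranked first. This is the effect of summing over programs with the same execution result.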

Lassie: HOL4 Tactics by Example

Author: Bancerek Grzegorz
Bansal Kshitij
Becker Heiko
Berant Jonathan
Blanchette Jasmin Christian
Cezary Kaliszyk Czajka
Corbineau Pierre
Coscoy Yann
de Moura Leonardo Mendonça
Delahaye David
Frerix Stefen
Ganesalingam Mohan
Gonthier Georges
Hallgren Thomas
Harrison John
Huang Daniel
Kaiser Jan-Oliver
Kaliszyk Cezary
Klein Gerwin
Lawrence
Matichuk Daniel
Nipkow Tobias
Ranta Aarne
Schulz Stephan
Slind Konrad
Solovyev Alexey
Ullrich Sebastian
Wang Sida I.
Wenzel Markus
Yang Kaiyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021

Proof engineering efforts using interactive theorem proving have yielded several impressive projects in software systems and mathematics. A key obstacle to such efforts is the requirement that the domain expert also be an expert in the low-level details of constructing the proof in a theorem prover. In particular, the user needs to select a sequence of tactics that leads to a successful proof, a task that in general requires knowledge of the exact names and use of a large set of tactics. We present Lassie, a tactic framework for the HOL4 theorem prover that allows individual users to define their own tactic language by example and give frequently used tactics or tactic combinations easier-to-remember names. The core of Lassie is an extensible semantic parser, which allows the user to interactively extend the tactic language through a process of definitional generalization. Defining tactics in Lassie thus does not require any knowledge of implementing custom tactics, while proofs written in Lassie retain the correctness guarantees provided by the HOL4 system. We show through case studies how Lassie can be used in small and larger proofs by novice and more experienced interactive theorem prover users, and how we envision it easing the learning curve in a HOL4 tutorial.

arXiv.org e-Print Archive

Crossref

MPG.PuRe